Evaluation and collection of proper name pronunciations online

Authors

  • Ariadna Font Llitjós
  • Alan W. Black
Abstract

Objective evaluation allows a model to be compared with other similar models. However, automatic pronunciation models should also be extensively evaluated by humans, since the ultimate goal of any pronunciation model is to produce an accurate pronunciation as judged by most people. This paper describes an initiative to evaluate and collect proper name pronunciations online, the development of the US Pronunciation of Proper Names Site (www.pronounce-names.org), and the results obtained so far. The internet, through our web-based interface, has already proven to be a very successful medium, both in number of evaluations and in data collection. In 5 weeks, it has brought 601 users to our site, who have evaluated 477 names and corrected 281 pronunciations. The information gathered is useful for improving our pronunciation models, as well as for (automatically) correcting the pronunciations in the CMU dictionary.

Similar resources

Comparative objective and subjective evaluation of three data-driven techniques for proper name pronunciation

Automatic pronunciation of unknown words is a hard problem of great importance in speech technology. Proper names constitute an especially difficult class of words to pronounce because of their low frequency of occurrence and variable origin. In this paper, we compare three different data-driven approaches which use a dictionary of (known) proper names to infer pronunciations for unknown names,...

Learning linguistically valid pronunciations from acoustic data

We describe an algorithm to learn word pronunciations from acoustic data. The algorithm jointly optimizes the pronunciation of a word using (a) the acoustic match of this pronunciation to the observed data, and (b) how “linguistically reasonable” the pronunciation is. Variations of word pronunciations in the recognition dictionary (which was created by linguists) are used to train a model of w...

Improving Proper Name Recognition by Adding Automatically Learned Pronunciation Variants to the Lexicon

This paper deals with the task of large vocabulary proper name recognition. In order to accommodate a wide diversity of possible name pronunciations (due to non-native name origins or speaker tongues), a multilingual acoustic model is combined with a lexicon comprising 3 grapheme-to-phoneme (G2P) transcriptions (from G2P transcribers for 3 different languages) and up to 4 so-called phoneme-tophon...

Word Pronunciation Disambiguation using the Web

This paper proposes an automatic method of reading proper names with multiple pronunciations. First, the method obtains Web pages that include both the proper name and its pronunciation. Second, the method feeds them to the learner for classification. The current accuracy is around 90% for open data.

Publication date: 2002